Using Discretization for Extending the Set of Predictive Features

Authors

  • Avi Rosenfeld
  • Ron Illuz
  • Dovid Gottesman
  • Mark Last
Abstract

To date, attribute discretization has typically been performed by replacing the original set of continuous features with a transposed set of discrete ones. This paper supports a new idea: discretized features should often be used in addition to the existing features, so that datasets are extended, not replaced, by discretization. We also claim that discretization algorithms should be developed with the explicit purpose of enriching a non-discretized dataset with discretized values. We present such an algorithm, D-MIAT, a supervised algorithm that discretizes data based on Minority Interesting Attribute Thresholds. D-MIAT generates new features only when strong indications exist for one of the target values to be learned, and is thus intended to be used in addition to the original data. We present extensive empirical results demonstrating the success of D-MIAT on 28 benchmark datasets. We also demonstrate that 10 other discretization algorithms can be used to generate features that improve performance when combined with the original non-discretized data. Our results show that the best predictive performance is attained by combining the original dataset with added features from a “standard” supervised discretization algorithm and from D-MIAT.
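The idea of extending, not replacing, a dataset can be illustrated with a minimal sketch. It does not implement D-MIAT, whose thresholds are not specified in this abstract; it swaps in plain unsupervised equal-width binning purely for illustration. The function names `equal_width_bin` and `extend_with_discretized`, the `n_bins` parameter, and the toy data are all assumptions made for this example.

```python
from typing import List, Sequence


def equal_width_bin(values: Sequence[float], n_bins: int) -> List[int]:
    """Map each value to an equal-width bin index in [0, n_bins - 1].

    Illustrative stand-in only: this is NOT the D-MIAT algorithm.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has a single value; place everything in bin 0.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(...) keeps the column maximum inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def extend_with_discretized(
    rows: Sequence[Sequence[float]], n_bins: int = 3
) -> List[List[float]]:
    """Keep every original column and append one binned column per feature."""
    columns = list(zip(*rows))
    binned = [equal_width_bin(col, n_bins) for col in columns]
    return [list(row) + [b[i] for b in binned] for i, row in enumerate(rows)]


# Hypothetical toy data: three instances, two continuous features.
rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 40.0]]
extended = extend_with_discretized(rows, n_bins=2)
# Each row now holds the 2 original values followed by 2 bin indices.
```

A classifier trained on `extended` sees both the continuous values and their discretized counterparts, which is the combination the abstract argues for; a supervised discretizer could replace `equal_width_bin` without changing the rest of the sketch.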

Similar resources

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...

A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts

High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...

An Evolutionary Multi-objective Discretization based on Normalized Cut

Learning models and their results depend on the quality of the input data. If raw data is not properly cleaned and structured, the results tend to be incorrect. Therefore, discretization, as one of the preprocessing techniques, plays an important role in learning processes. The most important challenge in the discretization process is to reduce the number of features’ values. This operat...

Design and implementation of a model predictive controller for the COVID-19 spread restraint in Iran

In this paper, a model is proposed based on the different levels of social restrictions for the COVID-19 spread restraint in Iran. Also, a Genetic Algorithm (GA) identifies the parameters of the model using reported main data from the Iranian Ministry of Health and simulated data based on the proposed model. Whereas Model Predictive Control (MPC) is a popular method which has been widely used in process ...

Model Predictive Control of Distributed Energy Resources with Predictive Set-Points for Grid-Connected Operation

This paper proposes an MPC-based (model predictive control) scheme to control active and reactive powers of DERs (distributed energy resources) in a grid-connected mode (either through a bus with its associated loads as a PCC (point of common coupling) or an MG (micro-grid)). DER may be a DG (distributed generation) or an ESS (energy storage system). In the proposed scheme, the set-poin...

Journal title:
  • EURASIP J. Adv. Sig. Proc.

Volume 2018  Issue 

Pages  -

Publication date 2018